Method of and device for forming the sum of a chain of products

ABSTRACT

Digital signal processing often requires the fast summing of a chain of products. Known signal processors often use two separate data buses via which the values to be multiplied are supplied in parallel, it being assumed that these values originate from different sources, for example from different memories. Because a product of two binary numbers has double the number of positions, an adder having double the word width is also used. In order to reduce this substantial expenditure at the expense of only a slight reduction in speed, an adder having only the single word width is provided, which processes the least-significant and the most-significant bits of the product during two successive clock periods. The values to be multiplied can then be supplied successively.

SUMMARY OF THE INVENTION

The invention relates to a method of forming the sum of a chain of products of each time two numbers which are successively supplied, each intermediate result of the summing operation being temporarily stored, and also relates to a device for forming the sum of a chain of products of each time a first and a second value, comprising a storage device for storing a number of values with a predetermined first number of bits, a clock-controlled control device for controlling the writing of values in registers, a first register device for storing each time two values to be multiplied by one another, a multiplier device comprising two inputs which are connected to the first register device, and a downstream product register which comprises an output for double the first number of bits, an adder device which comprises two sum inputs for each time the first number of bits, one of said inputs being connectable to the output of the product register, and a downstream sum register which consists of at least two partial sum registers for each time the first number of bits comprising an output which can be coupled to the other input of the adder device.

Methods of this kind are frequently used in digital signal processing, for example for the filtering of signal waveforms, and devices of this kind are used in many multipurpose signal processors. In order to enable the processing of signal sequences with a high frequency, customary signal processors comprise two data buses so as to enable the formation of a new product during each clock period. Because, moreover, this product contains double the number of bits, i.e. has double the word width, the adder device is also designed for double the word width. However, this represents a comparatively high expenditure.

It is an object of the invention to provide a method of the kind set forth which enables the formation of the sum of a chain of products with little expenditure and at the expense of only a slightly reduced speed.

This object is achieved in accordance with the invention in that the product produced by each separate multiplication is added to the intermediate result in two steps, in that during the first step only the least-significant positions of the product are added to the corresponding positions of the intermediate result, the first partial sum thus formed being temporarily stored, whereas during the second step the remaining, more-significant positions of the product are added to the corresponding remaining positions of the intermediate sum and the carry of the first partial sum, the second partial sum thus formed being temporarily stored, during each step there being supplied another one of the numbers to be directly subsequently multiplied.

It is a further object of the invention to provide a device of the kind set forth which enables the formation of the sum of a chain of products with less expenditure and at the expense of only a slightly reduced speed.

This object is achieved in accordance with the invention in that the storage device is connected to the first register device via only one data bus for the first number of bits, and that the control device is conceived to operate alternately in a first and a second clock period and to apply, during the first clock period, the first number of least-significant bits at the output of the product register and the contents of the second partial sum register to the inputs of the adder device, and to write, at the end of this clock period, one of the first values into the register device and the new partial sum appearing at the output of the adder device into the second partial sum register, and to apply, during the second clock period, the first number of most-significant bits at the output of the product register and the contents of the first partial sum register as well as a temporarily stored carry, to the adder device and to write, at the end of this clock period, one of the second values into the register device and the second partial sum appearing at the output of the adder device, into the first partial sum register as well as to write the product formed by the multiplier device into the product register.

Each complete partial sum is thus formed during two clock periods, so that overall the processing speed is halved. However, this requires only a single data bus via which the values to be processed are successively transferred and, moreover, it suffices to use an adder device for only the single word width which successively determines the partial sums during the two clock periods. Because initially only the partial sum is formed from the least-significant bits, the carry from the first partial sum is always available for the formation of the second partial sum. Consequently, no rounding errors occur, because all positions present are indeed evaluated, and the exact result can ultimately be extracted from the second partial sum register.
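The two-step addition described above can be illustrated by a minimal sketch. It assumes unsigned operands that fit in `width` bits and wrap-around when the double-width sum overflows; the function names and the `width` parameter are hypothetical and serve only for illustration.

```python
def accumulate_two_step(acc_high, acc_low, a, b, width=16):
    """Add a*b to the accumulator (acc_high:acc_low) with a width-bit adder used twice."""
    mask = (1 << width) - 1
    product = a * b  # double word width, assuming a and b fit in `width` bits

    # First step: least-significant halves only; the carry is kept aside.
    low_sum = acc_low + (product & mask)
    new_low = low_sum & mask
    carry = low_sum >> width

    # Second step: most-significant halves plus the stored carry.
    high_sum = acc_high + ((product >> width) & mask) + carry
    new_high = high_sum & mask  # overflow beyond 2*width bits wraps (assumption)
    return new_high, new_low


def sum_of_products(pairs, width=16):
    """Sum a chain of products given as (a, b) pairs, two adder steps per product."""
    high = low = 0  # partial sums start at zero
    for a, b in pairs:
        high, low = accumulate_two_step(high, low, a, b, width)
    return (high << width) | low
```

With `width=8`, for example, adding 255 · 255 = 0xFE01 and then 255 · 1 = 0x00FF produces a carry out of the low half, and `sum_of_products([(255, 255), (255, 1)], width=8)` yields 0xFF00 = 65280, the exact sum.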

BRIEF DESCRIPTION OF THE DRAWINGS

An embodiment of the invention will be described in detail hereinafter with reference to the drawing. Therein:

FIG. 1 shows a block diagram of significant parts of a device in accordance with the invention, and

FIG. 2 shows the succession in time of individual values occurring at different locations within the block diagram shown in FIG. 1.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

A control device 10 in FIG. 1 generates two control signals T1 and T2 on the leads 8 and 9, which signals cyclically alternate. On an output 11 the control device 10 also generates addresses for addressing a storage device 12. The connection 11 is shown to be double, because it actually consists of a number of parallel leads via which the bits of an address are transferred in parallel. This also holds for the data connections which are also shown to be double and which also consist of several leads via which the bits of each time one data word are transferred in parallel.

The data connection of the storage device 12, which may consist of a plurality of separate memories, including read-only memories, is connected to a data bus 14 whereto, in addition to further devices of a processor (not shown), the two registers 16 and 18 are connected in parallel. Write operations in these registers are controlled by control signals on the leads 8 and 9.

The outputs 17 and 19 of the registers 16 and 18, intended to store the factors to be multiplied by one another, are connected to the inputs of a multiplier device 20 which forms, within one clock period, the complete product of the two values supplied via the connections 17 and 19 and which outputs the product via the output 21. This product has double the word width, i.e. double the number of bits of the supplied factors. This is the only location within the device shown in FIG. 1 in which a double word width occurs, and the connection 21 is connected to the input of a product register 22 which is also designed for this double word width.
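That double the word width always suffices for the product follows from a simple bound, sketched here for unsigned factors. The symbol n for the number of bits of each factor and the unsigned interpretation are assumptions made for illustration.

```latex
\[
0 \le A,\, B \le 2^{n} - 1
\quad\Longrightarrow\quad
A \cdot B \;\le\; (2^{n} - 1)^{2} \;=\; 2^{2n} - 2^{n+1} + 1 \;<\; 2^{2n}
\]
```

For n = 8, for example, the largest product is 255 · 255 = 65025, which is below 2^16 = 65536 and therefore fits in 16 bits.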

At the output side of the product register 22 there are connected two leads 23 and 29, the lead 23 carrying the most-significant bits whereas the lead 29 carries the least-significant bits, i.e. each lead has the single word width. The connections 23 and 29 lead to a multiplexer 24 which connects, under the control of a control signal on the lead 8, either the connection 23 or the connection 29 to an input 25 of an adder device 26. The other input 35 of the adder device 26 is connected to the output of a second multiplexer 34.

The sum output 27 of the adder 26 is connected in parallel to the inputs of two partial sum registers 30 and 32 which are alternately activated for writing via the control leads 8 and 9. The second partial sum register 32 is also provided with a carry memory 28 in which, in parallel with the second sum register, the carry occurring at the output 27a of the adder device 26 is written via the control lead 8.

The outputs 31 and 33 of the partial sum registers 30 and 32 lead to the two inputs of the multiplexer 34 which connects, controlled via the control lead 8, either the output 31 together with the output 37 of the carry memory 28 or the output 33 to the second input 35 of the adder device 26. Furthermore, the output 31 carries the desired sum at the end of the processing of the chain of products.

However, it is often desirable that the contents of the register 32 can also be read and applied to other elements.
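The routing performed by the multiplexers 24 and 34 can be summarized in a short sketch. The function name, the encoding of the clock period as the strings "T1" and "T2", and the `width` parameter are assumptions made for illustration; the register names follow the reference numerals of FIG. 1.

```python
def adder_operands(period, product22, sum30, sum32, carry28, width=16):
    """Return (input25, input35, carry_in) as selected by multiplexers 24 and 34."""
    mask = (1 << width) - 1
    if period == "T1":
        # Lead 29 (low half of the product) and output 33 of register 32; no carry.
        return product22 & mask, sum32, 0
    if period == "T2":
        # Lead 23 (high half of the product) and output 31 of register 30,
        # together with output 37 of the carry memory 28.
        return (product22 >> width) & mask, sum30, carry28
    raise ValueError("period must be 'T1' or 'T2'")
```

In both periods the adder device 26 only ever sees operands of the single word width, which is the point of the arrangement.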

The function of the device shown in FIG. 1 will now be described with reference to the time diagram of FIG. 2 in which the numbers preceding the individual lines denote the signals or values on the connections or the contents of the blocks denoted by the relevant references. The device shown in FIG. 1 serves to determine scalar products, i.e. expressions of the form $S = \sum_{i=1}^{n} A_{i} \cdot B_{i}$. The individual values A_(i) and B_(i) are present in stored form, because they are required at defined instants. The intermediate sums, arising after the summing of each new product P_(i) = A_(i) · B_(i), are always temporarily stored, the preceding intermediate sum being erased thereby. The intermediate sum formed after the last product at the same time constitutes the final result.

In the time diagram of FIG. 2 it is initially assumed that the first products have already been processed and that the corresponding intermediate sum has been formed. Thus, the instant t₀ at the end of a second clock period T2 represents an arbitrarily selected instant during the processing of the chain of products. At this instant t₀ one factor A_(i) of the current product P_(i) to be formed, read from the storage device 12 during this clock period T2, is written into the register 16, i.e. by the ascending signal on the line 8 for the first clock period T1. At the same time the previous product P_(i-1), formed in the multiplier device 20, is written into the product register 22. The partial sum registers 30 and 32 will be considered in detail hereinafter.

During the clock period T1, subsequent to the instant t₀, the second value B_(i) of the product to be formed is read from the storage device 12 and applied to the input of the register 18 via the data bus 14. The signal on the control lead 9, being low during this clock period, controls the multiplexer 24 so that the connection 29, carrying the least-significant bits of the product P_(i-1), is coupled to the input 25 of the adder device 26, and at the same time the output 33 of the partial sum register 32 is coupled to the input 35 of the adder device 26, via the multiplexer 34 which is controlled in the same way. The adder device 26 forms the partial sum S^(L)_(i-1) during the clock period T1 and outputs it via the output 27 near the end of the clock period T1.

At the end of the clock period T1, i.e. at the instant t₁ when the signal on the control lead 8 becomes low and that on the control lead 9 becomes high, the least-significant half of the partial sum thus formed is written into the second sum register 32 and at the same time the carry signal output via the output 27a is written into the carry memory 28. Moreover, the second factor B_(i) of the product to be formed is written into the register 18. Furthermore, the two multiplexers 24 and 34 are switched over so that now the input 25 of the adder device 26 receives the most-significant bits of the product P_(i-1) at the output 23 of the product register 22 whose contents have remained the same, and at the same time the input 35 of the adder device 26 receives the more-significant bits, stored in the first partial sum register 30, of the preceding partial sum as well as the carry, output via the output 37, from the carry memory 28, so that the most-significant part of the new intermediate sum is now formed in the adder device 26 and output via the output 27; at the same time the multiplier device 20 forms the next product P_(i), because both values A_(i) and B_(i) are present in the registers 16 and 18. Moreover, the control device 10 addresses the first value A_(i+1) for the next product P_(i+1) and applies this value, via the bus 14, inter alia to the register 16.

At the end of the clock period T2, i.e. at the instant t₂, the control signal on the lead 8 becomes high, so that the value A_(i+1) is written into the register 16, and at the same time the product P_(i) output by the multiplier 20 via the output 21 is written into the product register 22; furthermore, the most-significant part of the previous intermediate sum S^(H)_(i-1) is written into the first partial sum register 30. The processing of the product P_(i-1) has thus been completed, and the product P_(i) then stored in the product register 22 can be further processed.

This takes place during the second clock period T1 shown, during which the multiplexer 24 is switched over again and applies the least-significant bits of the product P_(i) at the output 29 of the product register 22 to the input 25 of the adder device 26. At the same time the other input 35 of the adder device 26 receives the preceding partial sum S^(L)_(i-1), via the multiplexer 34, from the output 33 of the second partial sum register 32, and the new least-significant partial sum S^(L)_(i) is formed at the output 27 of the adder device 26 and written into the second partial sum register 32 at the end of the clock period T1, i.e. at the instant t₃. At the same time the second value B_(i+1), read during the clock period T1, is written into the register 18.

During the third clock period T2 shown, the next valid product P_(i+1) is formed at the output 21 of the multiplier 20, and at the same time the most-significant bits from the product register 22 are applied to the input 25 of the adder device 26, the input 35 receiving the most-significant preceding partial sum S^(H)_(i-1) from the first partial sum register 30 as well as the carry signal from the carry memory 28, so that the new most-significant partial sum S^(H)_(i) is formed at the output 27 of the adder device 26. This partial sum is written into the first partial sum register 30 again at the end of the third clock period T2 shown.

This process is cyclically continued until all products of the chain have been calculated and processed by accumulation in the partial sum registers. As is shown in FIG. 2, processing takes place according to the pipeline principle, i.e. during the formation of one product at the same time the preceding partial sum is formed and the values of the next product are supplied. At the start, i.e. upon formation of the first product P₁, therefore, some preliminary processing steps are required.

These steps can be seen from FIG. 2 when i=1 is assumed. This means that prior to the instant t₀ first one value A₁ must be read and written into the register 16 at the instant t₀. Subsequently, the second value B₁ is read and written into the register 18 at the instant t₁. Subsequently, the first product P₁ can be formed so as to be written into the product register 22 at the instant t₂. The product P₁ can then be processed in the described manner; evidently, the partial sum registers 30 and 32 must have been erased during these preceding processing steps, i.e. they must contain the value zero.
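The pipelined sequence, including the preliminary steps and the final draining of the product register, can be traced with a clock-by-clock sketch. It assumes unsigned operands that fit in `width` bits, wrap-around on overflow of the double-width sum, and a zero placed on the bus once the values are exhausted; the function name and the `width` parameter are hypothetical. Variable names follow the reference numerals of FIG. 1.

```python
def simulate_device(a_values, b_values, width=16):
    """Clock-by-clock model of the FIG. 1 datapath; returns the final sum."""
    if len(a_values) != len(b_values):
        raise ValueError("need one B value for every A value")
    n = len(a_values)
    if n == 0:
        return 0
    mask = (1 << width) - 1

    reg16, reg18 = a_values[0], 0  # first A value is written at t0
    product22 = 0                  # product register, double word width
    sum30 = sum32 = carry28 = 0    # partial sum registers erased at start

    # One extra T1/T2 pair drains the last product out of register 22.
    for k in range(n + 1):
        # Clock period T1: adder takes the low half of register 22 and
        # register 32 while the bus carries the next B value.
        low = (product22 & mask) + sum32
        bus = b_values[k] if k < n else 0
        # End of T1: write register 32, carry memory 28 and register 18.
        sum32, carry28, reg18 = low & mask, low >> width, bus

        # Clock period T2: the multiplier forms the next product while the
        # adder takes the high half, register 30 and the stored carry;
        # the bus carries the next A value.
        next_product = reg16 * reg18
        high = (product22 >> width) + sum30 + carry28
        bus = a_values[k + 1] if k + 1 < n else 0
        # End of T2: write register 30, product register 22 and register 16.
        sum30, product22, reg16 = high & mask, next_product, bus

    return (sum30 << width) | sum32
```

In this model a chain of n products occupies n + 1 pairs of clock periods after the first value has been loaded: the first pair only fills the pipeline and the last pair only accumulates the final product.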

The sum of a chain of products is thus formed while maintaining complete accuracy, without rounding errors, and requiring only an adder for the single word width.

We claim:
 1. A method of forming the sum of a chain of products of each time two numbers which are successively supplied, each intermediate result of the summing operation being temporarily stored, characterized in that the product produced by each separate multiplication is added to the intermediate result in two steps, in that during the first step only the least-significant positions of the product are added by an adder to the corresponding positions of the intermediate result, the first partial sum and carry thus formed being temporarily stored in a first partial sum register, and a carry register, and during the second step the remaining, more-significant positions of the product are added by said adder to the corresponding remaining positions of the intermediate sum and carry of the first partial sum, the second partial sum thus formed being temporarily stored in a second partial sum register, during each step there being supplied another one of the numbers to be directly subsequently multiplied.
 2. A device for forming the sum of a chain of products of each time a first and a second value, comprising a storage device for storing a number of values with a predetermined number of bits, a clock-controlled device for controlling the writing of values in registers, a first register device for storing each time two values to be multiplied by one another, a multiplier device comprising two inputs which are connected to the first register device, and a downstream product register which comprises an output for double the first number of bits, an adder device which comprises two sum inputs for each time the first number of bits, one of said inputs being connectable to the output of the product register, and a downstream sum register which consists of at least first and second partial sum registers for each time the first number of bits, comprising an output selectively coupled to the other input of the adder device, characterized in that the storage device (12) is connected to the first register device (16, 18) via only one data bus (14) for the first number of bits and that the control device (10) operates alternatively in a first and a second clock period and to apply, during the first clock period, the first number of least-significant bits at the output of the product register (22) and the contents of the first partial sum register (32) to the inputs of the adder device (26), and to write, at the end of this clock period, one of the first values (B_(i)) into the register device (16) and the first partial sum appearing at the output of the adder device (26) into the first partial sum register (32), and a carry into a temporary carry storage register, and to apply, during the second clock period, the first number of most-significant bits at the output of the product register (22) and the contents of the second partial sum register and the temporarily stored carry to the adder device (26) and to write, at the end of this clock period, one of the second values (A_(i+1)) into the register device (16, 18) and the second partial sum, appearing at the output of the adder device (26), into the second partial sum register and to write the product formed by the multiplier device (20) into the product register (22).
 3. A device as claimed in claim 2, further comprising a first multiplexer (24) connected between the output of the product register (22) and one sum input of the adder device (26), a second multiplexer (34) connected between the outputs of the partial sum registers (30, 32) and the other input of the adder device (26), the control device (10) switching the multiplexers after each clock period.
 4. A method of forming the sum of a chain of products of a plurality of sets of numbers each having a predetermined data width, which are successively supplied, each intermediate result of the summing operation being temporarily stored, comprising the steps of: (a) sequentially receiving the plurality of numbers through a common pathway; (b) forming a product of the numbers; (c) sequentially summing each portion of the product with each corresponding portion of an accumulated value, with carry from any preceding partial summing operation, in order of portions from least to most significance, and updating the corresponding portion of the accumulated value as the computed sum of the portion and corresponding portion, with carry to any succeeding partial summing operation, the portion and corresponding portion having the predetermined data width.
 5. The method according to claim 4, further comprising the step of providing clock cycles, a new one of said plurality of numbers being received and a new partial sum being computed on each successive clock cycle.
 6. The method according to claim 4, wherein said plurality of numbers are successively supplied through a data bus.
 7. The method according to claim 4, wherein two numbers are received, and the product is divided into two portions for two partial summing operations.
 8. The method according to claim 4, wherein the numbers are stored prior to said product forming step.
 9. The method according to claim 4, further comprising the step of storing the product of the received numbers in a register.
 10. The method according to claim 4, wherein each set includes two numbers, wherein the product is summed with the accumulated value in two steps, a first step in which a least-significant portion of the product is added to the corresponding portion of the accumulated value, and a second step in which a most-significant portion of the product is added to the corresponding portion of the accumulated value, during each step there being supplied another one of the numbers to be subsequently multiplied.
 11. A device for forming the sum of a chain of products of a plurality of values, sequentially receiving sets of values to be multiplied, comprising: (a) a data bus having a predetermined data width; (b) a plurality of registers for storing a number of values received from said data bus, each having a predetermined data width; (c) a multiplier receiving the values from the plurality of registers and computing a product; (d) a product register for storing the product, having a data width in excess of the predetermined data width; (e) an accumulator, having selectively addressable portions for storing data; (f) a carry register; (f) an adder having two sum inputs, a sum output, a carry input and a carry output, a first input receiving a selected portion of stored data in said accumulator, a second input receiving a corresponding portion of said stored product from the product register, said carry input receiving data stored in said carry register, said sum output having a data width smaller than said product register, and said sum output outputting a sum of said first and second inputs; and (g) means for controlling in sequence the storing of sequential sets of values in said plurality of registers; selectively addressing portions of said accumulator; selectively addressing corresponding portions of said stored product; replacing said selected portion of stored data in said accumulator with said sum output; and selectively supplying said data in said carry register to said carry input.
 12. The device according to claim 11, further comprising: (a) a first multiplexer connected between said accumulator and said first input of said adder, selectively accessing said portions of said accumulator; and (b) a second multiplexer connected between said product register and said second sum input of said adder, selectively accessing corresponding portions of said product register, said controlling means controlling said first and second multiplexers to select said portions of said accumulator and said corresponding portions of said product register, in sequential order of least significance to most significance, and supplying said data in said carry register to said carry input, except for in conjunction with portions of least significance.
 13. The device according to claim 11, wherein the number of values is two, and said controlling means comprises a clock operating alternatively in first and second clock periods, being for: (a) during said first clock period: (i) selecting a first portion of said accumulator including the least-significant bits and said corresponding portion of said stored product for input to said adder; (ii) storing, at the end of said first clock period, the first value into a first register, respectively; and (iii) storing said carry output and partial sum from said sum output, into said carry register and said first portion of said accumulator; and (b) during said second clock period: (i) selecting a second portion of said accumulator including the most-significant bits, said data stored in said carry register, and said corresponding portion of said stored product for input to said adder; (ii) storing, at the end of said second clock period, the second value into a second register; (iii) storing a partial sum from said sum output into said second portion of said accumulator; and (iv) storing said product in said product register.
 14. The device according to claim 13, further comprising: (a) a first multiplexer connected between said accumulator and said first input of said adder, selectively accessing said first portion and said second portion of said accumulator; and (b) a second multiplexer connected between said product register and said second sum input of said adder, selectively accessing said first corresponding portion and said second corresponding portion of said product register, said controlling means switching said first and second multiplexers to select said first portion and said first corresponding portion or said second portion and said second corresponding portion, respectively, after each clock period.